histogram data
Graph Attention-Driven Bayesian Deep Unrolling for Dual-Peak Single-Photon Lidar Imaging
Choi, Kyungmin, Koo, JaKeoung, McLaughlin, Stephen, Halimi, Abderrahim
Single-photon Lidar imaging offers a significant advantage in 3D imaging due to its high resolution and long-range capabilities, however it is challenging to apply in noisy environments with multiple targets per pixel. To tackle these challenges, several methods have been proposed. Statistical methods demonstrate interpretability on the inferred parameters, but they are often limited in their ability to handle complex scenes. Deep learning-based methods have shown superior performance in terms of accuracy and robustness, but they lack interpretability or they are limited to a single-peak per pixel. In this paper, we propose a deep unrolling algorithm for dual-peak single-photon Lidar imaging. We introduce a hierarchical Bayesian model for multiple targets and propose a neural network that unrolls the underlying statistical method. To support multiple targets, we adopt a dual depth maps representation and exploit geometric deep learning to extract features from the point cloud. The proposed method takes advantages of statistical methods and learning-based methods in terms of accuracy and quantifying uncertainty. The experimental results on synthetic and real data demonstrate the competitive performance when compared to existing methods, while also providing uncertainty information.
Non-linear Metric Learning
How to compare examples is a fundamental question in machine learning. If an algorithm could perfectly determine whether two examples were semantically similar or dissimilar, most subsequent machine learning tasks would become trivial (i.e., a nearest neighbor classifier will achieve perfect results). Guided by this motivation, a surge of recent research [10, 13, 15, 24, 31, 32] has focused on Mahalanobis metric learning. The resulting methods greatly improve the performance of metric dependent algorithms, such as k-means clustering and kNN classification, and have gained popularity in many research areas and applications within and beyond machine learning. One reason for this success is the out-of-the-box usability and robustness of several popular methods to learn these linear metrics.
GSOC 2022 with ML4SCI
This blog is a brief summary of my work and results in GSOC 2022 under ML4SCI. During my time in GSOC I worked on the Fast Accurate Empirical Representation of Histograms project. The goal of this project is to use seq2seq models map histogram data to empirical functions. In particular we planned on adapting a transformer for this purpose. In order to train our model we will need a sufficiently diverse set of expressions and their corresponding histograms.
Learning Decision Trees from Histogram Data Using Multiple Subsets of Bins
Gurung, Ram B. (Stockholm University) | Lindgren, Tony (Stockholm University) | Boström, Henrik (Stockholm University)
The standard approach of learning decision trees from histogram data is to treat the bins as independent variables. However, as the underlying dependencies among the bins might not be completely exploited by this approach, an algorithm has been proposed for learning decision trees from histogram data by considering all bins simultaneously while partitioning examples at each node of the tree. Although the algorithm has been demonstrated to improve predictive performance, its computational complexity has turned out to be a major bottleneck, in particular for histograms with a large number of bins. In this paper, we propose instead a sliding window approach to select subsets of the bins to be considered simultaneously while partitioning examples. This significantly reduces the number of possible splits to consider, allowing for substantially larger histograms to be handled. We also propose to evaluate the original bins independently, in addition to evaluating the subsets of bins when performing splits. This ensures that the information obtained by treating bins simultaneously is an additional gain compared to what is considered by the standard approach. Results of experiments on applying the new algorithm to both synthetic and real world datasets demonstrate positive results in terms of predictive performance without excessive computational cost.
Non-linear Metric Learning
Kedem, Dor, Tyree, Stephen, Sha, Fei, Lanckriet, Gert R., Weinberger, Kilian Q.
In this paper, we introduce two novel metric learning algorithms, χ2-LMNN and GB-LMNN, which are explicitly designed to be non-linear and easy-to-use. The two approaches achieve this goal in fundamentally different ways: χ2-LMNN inherits the computational benefits of a linear mapping from linear metric learning, but uses a non-linear χ2-distance to explicitly capture similarities within histogram data sets; GB-LMNN applies gradient-boosting to learn non-linear mappings directly in function space and takes advantage of this approach's robustness, speed, parallelizability and insensitivity towards the single additional hyper-parameter. On various benchmark data sets, we demonstrate these methods not only match the current state-of-the-art in terms of kNN classification error, but in the case of χ2-LMNN, obtain best results in 19 out of 20 learning settings.